feat: add next! and prev! for in-place LazyNode traversal #59
`next(o::LazyNode)` allocates a fresh `LazyNode` on every call, which is fine for occasional use but adds up sharply when a downstream package walks a large document — e.g. extracting all `Placemark` elements from a 50 MiB KML can allocate ~1 M `LazyNode` wrappers in the iterator alone (~38 MiB cumulative on a single benchmark run). Add a strictly-additive in-place pair, `next!(o)` / `prev!(o)`, that mutates `o` to point at the next/previous node and returns `o` (or `nothing` at the document boundary). Exported alongside `next` / `prev`. The aliasing trade-off is documented in the docstring: callers must not retain references to a previous position unless they explicitly snapshot with `LazyNode(o.raw)`. The existing `next` / `prev` methods are unchanged; this is purely opt-in API surface for hot paths.
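The in-place pattern and its aliasing trade-off can be illustrated with a minimal, self-contained sketch. `ToyCursor`, `current`, `snapshot`, and `collect_inplace` are hypothetical names invented for this illustration. They model a cursor over a flat token list and are not XML.jl's `LazyNode` or its internals. `snapshot` only plays the role that `LazyNode(o.raw)` plays in the description above.

```julia
# Toy stand-in for a lazy cursor over a flat token list (NOT XML.jl's LazyNode).
mutable struct ToyCursor
    tokens::Vector{String}
    pos::Int
end

current(c::ToyCursor) = c.tokens[c.pos]

# Allocating step: returns a fresh cursor, or `nothing` past the last token.
next(c::ToyCursor) = c.pos < length(c.tokens) ? ToyCursor(c.tokens, c.pos + 1) : nothing

# In-place step: mutates `c` and returns it, or `nothing` past the last token.
function next!(c::ToyCursor)
    c.pos < length(c.tokens) || return nothing
    c.pos += 1
    return c
end

# Explicit snapshot: an independent copy that keeps the current position.
snapshot(c::ToyCursor) = ToyCursor(c.tokens, c.pos)

# Walk every token while reusing a single cursor object.
function collect_inplace(tokens)
    out = String[]
    c = ToyCursor(tokens, 1)
    while c !== nothing
        push!(out, current(c))
        c = next!(c)
    end
    return out
end

tokens = ["kml", "Document", "Placemark", "name"]
walked = collect_inplace(tokens)

# Aliasing: `alias` is the same object as `c`; `snap` is an independent copy.
c = ToyCursor(tokens, 1)
alias = c
snap = snapshot(c)
next!(c)
```

After the final `next!(c)`, `alias` has moved along with `c` because both names refer to one object, while `snap` still points at the first token. That is the hazard the docstring warns about: any reference kept across an in-place step needs an explicit snapshot first.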
Following up on this PR after benchmarking against v0.4 (#54): the On a 100k-Placemark synthetic walk (all rows measured today on the same host, Julia 1.12.6, Darwin aarch64):
For a typed-DOM consumer (one I've opened design issue #61 laying out a SOTA-informed two-layer StAX design to recover this performance class under v0.4's immutable design. The proposed cursor-based StAX layer (with a new |
Summary
`next(::LazyNode)` allocates a fresh `LazyNode` wrapper on every call. Consumers walking large documents (e.g. extracting every `Placemark` from a 50 MiB KML) can churn ~1 M wrappers per traversal (~38 MiB cumulative). This PR adds `next!(o)` / `prev!(o)`, which mutate `o` in place and return it, or `nothing` at the document boundary. Functionally equivalent to `o = next(o)`, with zero per-step allocation.
Why this is safe
Strictly additive: `next` / `prev` are unchanged, callers opt in. The aliasing trade-off (`o` is the same object across calls, so a retained reference would silently track the new position) is documented inline; the docstring points readers needing a snapshot at `LazyNode(o.raw)`.
Why it matters
Measured on FastKML.jl extracting a `DataFrame` from a 47 MiB sample KML (~1 M `Raw` nodes traversed): the per-step allocation site at `next(::LazyNode)` was contributing ~38 MiB; switching the consumer's traversal loop to `next!` drops that to zero with no functional change. Independent of (and stackable with) the `next_no_xml_space` ctx fix in #58.
Verification
Full test suite passes (Julia 1.12), including a new `LazyNode next! / prev!` testset covering: functional equivalence with `next`, identity (`next!(o) === o`), memoization-field reset on advance, `nothing` at the document boundary, and `prev!` symmetry.
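As a rough illustration of what a testset covering those properties might look like, here is a sketch against a toy type. `ToyNode`, its `cached_tag` field, `tag`, and `move!` are hypothetical names made up for this example. The PR's real testset runs against `LazyNode` and is not reproduced here.

```julia
using Test

# Hypothetical stand-in for LazyNode: a position plus one memoized field.
mutable struct ToyNode
    data::Vector{String}
    pos::Int
    cached_tag::Union{Nothing,String}  # lazily computed, must be cleared on move
end
ToyNode(data, pos) = ToyNode(data, pos, nothing)

# Memoized accessor: computes the value once per position.
function tag(n::ToyNode)
    if n.cached_tag === nothing
        n.cached_tag = uppercase(n.data[n.pos])
    end
    return n.cached_tag
end

# Allocating steps.
next(n::ToyNode) = n.pos < length(n.data) ? ToyNode(n.data, n.pos + 1) : nothing
prev(n::ToyNode) = n.pos > 1 ? ToyNode(n.data, n.pos - 1) : nothing

# Shared in-place step: move, reset the memoized field, return the same object.
function move!(n::ToyNode, step::Int)
    p = n.pos + step
    1 <= p <= length(n.data) || return nothing
    n.pos = p
    n.cached_tag = nothing
    return n
end
next!(n::ToyNode) = move!(n, 1)
prev!(n::ToyNode) = move!(n, -1)

@testset "ToyNode next! / prev!" begin
    data = ["a", "b", "c"]
    o = ToyNode(data, 1)
    @test tag(o) == "A"                    # populate the memoized field
    fresh = next(o)
    @test next!(o) === o                   # identity
    @test o.pos == fresh.pos               # functional equivalence with next
    @test o.cached_tag === nothing         # memoization reset on advance
    @test tag(o) == "B"
    @test next!(o) === o
    @test next!(o) === nothing             # nothing at the end boundary
    @test prev!(o) === o                   # prev! symmetry
    @test o.pos == prev(ToyNode(data, 3)).pos
    @test prev!(prev!(o)) === nothing      # nothing at the start boundary
end
```

The memoization check is the one most specific to an in-place design: a cached value computed at the old position must not survive the move, otherwise the reused object would report stale data.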